Reviews: On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective
One of my biggest reservations about the submission is that the two-layer fully connected neural network model under consideration is nonstandard: the weights of the second layer are fixed to binary values (i.e., +1 or -1 with uniform scaling) and are not updated by the gradient descent procedure. Correct me if I am wrong, but this assumption is neither commonly adopted in experiments nor identical to the setup in comparable theoretical works such as [DLL+18] or [ADH+19]. If the authors are convinced of the significance or general applicability of the suggested framework, they should take more care in communicating this to the audience. (See the first sketch below for the model as I understand it.)

A relatively minor issue concerns the significance of the c_1 term in Theorem 3. The authors nicely demonstrate that the constant c_1 satisfying (13) can be controlled via a sum whose dominating term is inversely proportional to the eigengap (lambda_{m_l} - lambda_{m_l+1}), which is later formalized in Theorem 4. (By the way, I had trouble locating a formal definition of eps(f^*, l).) The question is: can we guarantee that this gap is large enough to ensure that the term c_1 is negligible? I am particularly worried about this, as the authors have already mentioned, in line 145, that the spectrum of the random matrix concentrates as n grows (see the second sketch below).
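To make the first concern concrete, here is a minimal sketch, in my own words and not the authors' code, of the training setup as I read it: the second-layer weights are drawn once as +/-1 with uniform scaling and held fixed, and gradient descent updates only the first layer. All sizes (n, d, m) and the learning rate are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 100, 10, 512  # samples, input dim, hidden width (assumed values)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

W = rng.standard_normal((m, d))                    # trainable first-layer weights
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)   # fixed +/-1 second layer, uniform scaling

lr = 0.1
for _ in range(200):
    H = np.maximum(X @ W.T, 0.0)   # ReLU features, shape (n, m)
    resid = H @ a - y              # residual of f(X) = H @ a against targets
    # Gradient of 0.5/n * ||f(X) - y||^2 w.r.t. W only; `a` receives no update.
    grad_W = ((resid[:, None] * (H > 0)) * a[None, :]).T @ X / n
    W -= lr * grad_W
```

If this reading is correct, the contrast with settings where both layers are trained (as in the cited theoretical works) deserves an explicit discussion in the paper.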
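For the second concern, the following rough numerical check illustrates what worries me. It is my own stand-in, not the paper's exact construction: I assume Gaussian inputs and a Gaussian RBF kernel, form the normalized Gram matrix K/n, and watch the eigengap lambda_{m_l} - lambda_{m_l+1} as n grows; the index m_l and the bandwidth are hypothetical choices. If the gap shrinks with n, the 1/(lambda_{m_l} - lambda_{m_l+1}) factor bounding c_1 grows accordingly.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m_l = 10, 5  # input dimension and eigenvalue index m_l (hypothetical choices)

for n in (200, 400, 800, 1600):
    X = rng.standard_normal((n, d))
    sq = (X ** 2).sum(axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared pairwise distances
    K = np.exp(-D / (2.0 * d)) / n                 # normalized Gram matrix K / n
    lam = np.linalg.eigvalsh(K)[::-1]              # eigenvalues in descending order
    print(n, lam[m_l - 1] - lam[m_l])              # gap lambda_{m_l} - lambda_{m_l+1}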